Overview

Dataset statistics

Number of variables: 15
Number of observations: 271116
Missing cells: 363853
Missing cells (%): 8.9%
Duplicate rows: 612
Duplicate rows (%): 0.2%
Total size in memory: 31.0 MiB
Average record size in memory: 120.0 B
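
The two percentages in this block follow directly from the raw counts. The sketch below recomputes them; the variable names and the `percent` helper are illustrative and not part of ydata-profiling, and rounding to one decimal place is an assumption about how the report formats its figures.

```python
# Figures copied from the "Dataset statistics" block above.
n_variables = 15
n_observations = 271_116
missing_cells = 363_853
duplicate_rows = 612


def percent(part, whole):
    """Return part / whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Missing cells are counted over every cell (rows x columns),
# while duplicate rows are counted over rows only.
total_cells = n_variables * n_observations
missing_pct = percent(missing_cells, total_cells)
duplicate_pct = percent(duplicate_rows, n_observations)

print(total_cells, missing_pct, duplicate_pct)
```

Note the different denominators: the missing-cell share is relative to all cells, the duplicate share to the number of rows.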

Variable types

Numeric: 5
Text: 6
Categorical: 4

Alerts

Dataset has 612 (0.2%) duplicate rows [Duplicates]
Height is highly correlated overall with Weight [High correlation]
Weight is highly correlated overall with Height and 1 other field [High correlation]
Year is highly correlated overall with City [High correlation]
Sex is highly correlated overall with Weight [High correlation]
Season is highly correlated overall with City [High correlation]
City is highly correlated overall with Year and 1 other field [High correlation]
Age has 9474 (3.5%) missing values [Missing]
Height has 60171 (22.2%) missing values [Missing]
Weight has 62875 (23.2%) missing values [Missing]
Medal has 231333 (85.3%) missing values [Missing]
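
The missing-value alerts can be cross-checked against the dataset-level count of missing cells. The following is a minimal sketch with the counts copied from the alerts; the dictionary name is hypothetical, and one-decimal rounding is an assumption about the report's formatting.

```python
# Missing-value counts copied from the alerts above.
n_observations = 271_116
missing_by_variable = {
    "Age": 9_474,
    "Height": 60_171,
    "Weight": 62_875,
    "Medal": 231_333,
}

# Share of rows in which each variable is missing.
missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_by_variable.items()
}

# Only these four variables have missing values, so their counts
# should add up to the dataset-level "Missing cells" figure.
total_missing = sum(missing_by_variable.values())

print(missing_pct, total_missing)
```

The total agrees with the 363853 missing cells reported under Dataset statistics, which is consistent with the remaining eleven variables reporting zero missing values.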

Reproduction

Analysis started: 2024-03-20 13:12:08.763264
Analysis finished: 2024-03-20 13:12:20.654835
Duration: 11.89 seconds
Software version: ydata-profiling v4.6.0
Download configuration: config.json
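
The reported duration is the difference between the two timestamps. A small sketch using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2024-03-20 13:12:08.763264")
finished = datetime.fromisoformat("2024-03-20 13:12:20.654835")

# Subtracting two datetimes yields a timedelta; convert it to seconds.
duration_seconds = (finished - started).total_seconds()
print(round(duration_seconds, 2))
```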

Variables

ID
Real number (ℝ)

Distinct: 135571
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68248.954
Minimum: 1
Maximum: 135571
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 7347.75
Q1: 34643
Median: 68205
Q3: 102097.25
95-th percentile: 128978
Maximum: 135571
Range: 135570
Interquartile range (IQR): 67454.25

Descriptive statistics

Standard deviation: 39022.286
Coefficient of variation (CV): 0.57176387
Kurtosis: -1.1972922
Mean: 68248.954
Median Absolute Deviation (MAD): 33738
Skewness: -0.0046811565
Sum: 1.8503384 × 10^10
Variance: 1.5227388 × 10^9
Monotonicity: Increasing
Figure: Histogram with fixed size bins (bins=50)
Value                   Count    Frequency (%)
77710                   58       < 0.1%
106296                  39       < 0.1%
115354                  38       < 0.1%
119591                  36       < 0.1%
129196                  32       < 0.1%
44875                   32       < 0.1%
53240                   32       < 0.1%
119590                  32       < 0.1%
89187                   32       < 0.1%
106156                  31       < 0.1%
Other values (135561)   270754   99.9%
Value   Count   Frequency (%)
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       6       < 0.1%
6       8       < 0.1%
7       8       < 0.1%
8       2       < 0.1%
9       1       < 0.1%
10      1       < 0.1%
Value    Count   Frequency (%)
135571   2       < 0.1%
135570   2       < 0.1%
135569   1       < 0.1%
135568   1       < 0.1%
135567   2       < 0.1%
135566   1       < 0.1%
135565   2       < 0.1%
135564   1       < 0.1%
135563   2       < 0.1%
135562   1       < 0.1%
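
Several of the ID statistics are tied together by simple identities: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks these using the rounded figures printed in the report, so the comparisons use a tolerance; the variable names are illustrative.

```python
import math

# Rounded figures copied from the ID statistics above.
minimum, maximum = 1, 135571
q1, q3 = 34643, 102097.25
mean = 68248.954
std = 39022.286

value_range = maximum - minimum  # reported as Range
iqr = q3 - q1                    # reported as Interquartile range (IQR)
cv = std / mean                  # reported as Coefficient of variation (CV)
variance = std ** 2              # reported as Variance

# The inputs are rounded, so compare with a relative tolerance
# rather than expecting exact equality.
cv_matches = math.isclose(cv, 0.57176387, rel_tol=1e-6)
variance_matches = math.isclose(variance, 1.5227388e9, rel_tol=1e-6)
print(value_range, iqr, cv_matches, variance_matches)
```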

Name
Text

Distinct: 134732
Distinct (%): 49.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length: 108
Median length: 78
Mean length: 19.34199
Min length: 2

Characters and Unicode

Total characters: 5243923
Distinct characters: 63
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 77009
Unique (%): 28.4%

Sample

1st row: A Dijiang
2nd row: A Lamusi
3rd row: Gunnar Nielsen Aaby
4th row: Edgar Lindenau Aabye
5th row: Christine Jacoba Aaftink
Value                   Count    Frequency (%)
john                    3881     0.5%
de                      3794     0.5%
robert                  2597     0.4%
william                 2329     0.3%
james                   2027     0.3%
peter                   2007     0.3%
van                     1966     0.3%
michael                 1928     0.3%
david                   1925     0.3%
joseph                  1854     0.3%
Other values (108718)   716928   96.7%
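
The table above counts individual words rather than whole names, and its percentages are relative to the total number of words. The sketch below shows one way such counts could be produced. It assumes lowercasing and whitespace splitting, which is inferred from the lowercased entries in the table; the profiler's actual tokenizer may differ (for example in how it treats punctuation), and `word_counts` is a hypothetical helper.

```python
from collections import Counter

# The five sample rows shown for the Name variable.
sample_names = [
    "A Dijiang",
    "A Lamusi",
    "Gunnar Nielsen Aaby",
    "Edgar Lindenau Aabye",
    "Christine Jacoba Aaftink",
]


def word_counts(values):
    """Count lowercased, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts


counts = word_counts(sample_names)
total_words = sum(counts.values())

# Frequencies are shares of all tokens, not of rows.
share_of_a = round(100 * counts["a"] / total_words, 1)
print(counts.most_common(3), total_words, share_of_a)
```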

Most occurring characters

Value               Count     Frequency (%)
a                   491302    9.4%
(space)             470440    9.0%
e                   428351    8.2%
i                   348275    6.6%
n                   348236    6.6%
r                   334503    6.4%
o                   298902    5.7%
l                   233064    4.4%
s                   189413    3.6%
t                   165499    3.2%
Other values (53)   1935938   36.9%

Most occurring categories

Value               Count     Frequency (%)
Lowercase Letter    3885189   74.1%
Uppercase Letter    751692    14.3%
Space Separator     470440    9.0%
Other Punctuation   60783     1.2%
Dash Punctuation    39474     0.8%
Close Punctuation   18174     0.3%
Open Punctuation    18170     0.3%
Modifier Symbol     1         < 0.1%
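
The category names in this table are Unicode general categories. As a hedged illustration, Python's standard `unicodedata` module exposes the two-letter category code for each character; the mapping to long names below covers only the categories that appear for Name, and `category_counts` is a hypothetical helper, not the profiler's own code.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes seen in the Name variable.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sk": "Modifier Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# First two sample rows of the Name variable.
counts = category_counts(["A Dijiang", "A Lamusi"])
print(dict(counts))
```

The single Modifier Symbol in the table corresponds to a backtick, whose general category is `Sk`.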

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 491302
12.6%
e 428351
11.0%
i 348275
 
9.0%
n 348236
 
9.0%
r 334503
 
8.6%
o 298902
 
7.7%
l 233064
 
6.0%
s 189413
 
4.9%
t 165499
 
4.3%
h 130797
 
3.4%
Other values (16) 916847
23.6%
Uppercase Letter
ValueCountFrequency (%)
M 67864
 
9.0%
A 59013
 
7.9%
S 58822
 
7.8%
J 49127
 
6.5%
B 40397
 
5.4%
K 38545
 
5.1%
C 38204
 
5.1%
R 36780
 
4.9%
G 36229
 
4.8%
L 36200
 
4.8%
Other values (16) 290511
38.6%
Other Punctuation
ValueCountFrequency (%)
" 49602
81.6%
. 6909
 
11.4%
, 2801
 
4.6%
' 1449
 
2.4%
& 19
 
< 0.1%
/ 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
470440
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 39474
100.0%
Close Punctuation
ValueCountFrequency (%)
) 18174
100.0%
Open Punctuation
ValueCountFrequency (%)
( 18170
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4636881
88.4%
Common 607042
 
11.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 491302
 
10.6%
e 428351
 
9.2%
i 348275
 
7.5%
n 348236
 
7.5%
r 334503
 
7.2%
o 298902
 
6.4%
l 233064
 
5.0%
s 189413
 
4.1%
t 165499
 
3.6%
h 130797
 
2.8%
Other values (42) 1668539
36.0%
Common
ValueCountFrequency (%)
470440
77.5%
" 49602
 
8.2%
- 39474
 
6.5%
) 18174
 
3.0%
( 18170
 
3.0%
. 6909
 
1.1%
, 2801
 
0.5%
' 1449
 
0.2%
& 19
 
< 0.1%
/ 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5243923
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 491302
 
9.4%
470440
 
9.0%
e 428351
 
8.2%
i 348275
 
6.6%
n 348236
 
6.6%
r 334503
 
6.4%
o 298902
 
5.7%
l 233064
 
4.4%
s 189413
 
3.6%
t 165499
 
3.2%
Other values (53) 1935938
36.9%

Sex
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB
M: 196594
F: 74522

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters: 271116
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowM
2nd rowM
3rd rowM
4th rowM
5th rowF

Common Values

Value   Count    Frequency (%)
M       196594   72.5%
F       74522    27.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count    Frequency (%)
m       196594   72.5%
f       74522    27.5%

Most occurring characters

ValueCountFrequency (%)
M 196594
72.5%
F 74522
 
27.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 271116
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 196594
72.5%
F 74522
 
27.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 271116
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 196594
72.5%
F 74522
 
27.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 271116
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 196594
72.5%
F 74522
 
27.5%

Age
Real number (ℝ)

MISSING 

Distinct: 74
Distinct (%): < 0.1%
Missing: 9474
Missing (%): 3.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.556898
Minimum: 10
Maximum: 97
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 10
5-th percentile: 18
Q1: 21
Median: 24
Q3: 28
95-th percentile: 37
Maximum: 97
Range: 87
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 6.3935608
Coefficient of variation (CV): 0.25016967
Kurtosis: 6.2706424
Mean: 25.556898
Median Absolute Deviation (MAD): 3
Skewness: 1.7471225
Sum: 6686758
Variance: 40.87762
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
23                  21875   8.1%
24                  21720   8.0%
22                  20814   7.7%
25                  19707   7.3%
21                  19164   7.1%
26                  17675   6.5%
27                  16025   5.9%
20                  15258   5.6%
28                  14043   5.2%
19                  11643   4.3%
Other values (64)   83718   30.9%
Value   Count   Frequency (%)
10      1       < 0.1%
11      13      < 0.1%
12      39      < 0.1%
13      187     0.1%
14      837     0.3%
15      2203    0.8%
16      3852    1.4%
17      5376    2.0%
18      8152    3.0%
19      11643   4.3%
Value   Count   Frequency (%)
97      1       < 0.1%
96      1       < 0.1%
88      3       < 0.1%
84      1       < 0.1%
81      2       < 0.1%
80      3       < 0.1%
77      2       < 0.1%
76      7       < 0.1%
75      4       < 0.1%
74      12      < 0.1%
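
For a variable with missing values, the mean and sum are taken over the non-missing rows only. The sketch below checks this for Age using the figures above: dividing the reported sum by the reported mean should recover the number of non-missing observations. Variable names are illustrative, and the mean is rounded in the report, so the implied count is rounded to the nearest integer.

```python
# Figures copied from the Age statistics and the dataset overview.
n_observations = 271_116
missing = 9_474
mean_age = 25.556898
sum_age = 6_686_758

# Rows that actually carry an Age value.
non_missing = n_observations - missing

# Sum / mean gives the number of values that entered the mean.
implied_count = round(sum_age / mean_age)

print(non_missing, implied_count)
```

The two numbers agree, which suggests that missing ages are excluded from the mean rather than treated as zero.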

Height
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 95
Distinct (%): < 0.1%
Missing: 60171
Missing (%): 22.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 175.33897
Minimum: 127
Maximum: 226
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 127
5-th percentile: 158
Q1: 168
Median: 175
Q3: 183
95-th percentile: 193
Maximum: 226
Range: 99
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 10.518462
Coefficient of variation (CV): 0.059989301
Kurtosis: 0.17772797
Mean: 175.33897
Median Absolute Deviation (MAD): 7
Skewness: 0.018477298
Sum: 36986879
Variance: 110.63805
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
180                 12492    4.6%
170                 11976    4.4%
178                 10708    3.9%
175                 10320    3.8%
183                 8284     3.1%
168                 8211     3.0%
173                 7843     2.9%
172                 7813     2.9%
165                 7246     2.7%
185                 6839     2.5%
Other values (85)   119213   44.0%
(Missing)           60171    22.2%
ValueCountFrequency (%)
127 7
 
< 0.1%
128 1
 
< 0.1%
130 2
 
< 0.1%
131 2
 
< 0.1%
132 9
 
< 0.1%
133 6
 
< 0.1%
135 14
< 0.1%
136 28
< 0.1%
137 18
< 0.1%
138 20
< 0.1%
ValueCountFrequency (%)
226 3
 
< 0.1%
223 4
 
< 0.1%
221 4
 
< 0.1%
220 6
 
< 0.1%
219 2
 
< 0.1%
218 13
< 0.1%
217 11
< 0.1%
216 12
< 0.1%
215 19
< 0.1%
214 16
< 0.1%

Weight
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 220
Distinct (%): 0.1%
Missing: 62875
Missing (%): 23.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 70.702393
Minimum: 25
Maximum: 214
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 25
5-th percentile: 50
Q1: 60
Median: 70
Q3: 79
95-th percentile: 95
Maximum: 214
Range: 189
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 14.34802
Coefficient of variation (CV): 0.20293542
Kurtosis: 2.0175229
Mean: 70.702393
Median Absolute Deviation (MAD): 9
Skewness: 0.79716903
Sum: 14723137
Variance: 205.86568
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
70                   9625     3.6%
60                   7994     2.9%
75                   7810     2.9%
68                   7284     2.7%
65                   7236     2.7%
72                   6252     2.3%
80                   6214     2.3%
73                   5937     2.2%
63                   5869     2.2%
64                   5764     2.1%
Other values (210)   138256   51.0%
(Missing)            62875    23.2%
ValueCountFrequency (%)
25 6
 
< 0.1%
28 14
 
< 0.1%
30 42
 
< 0.1%
31 23
 
< 0.1%
32 41
 
< 0.1%
33 51
 
< 0.1%
34 73
< 0.1%
35 92
< 0.1%
36 137
0.1%
37 173
0.1%
ValueCountFrequency (%)
214 2
 
< 0.1%
198 1
 
< 0.1%
190 1
 
< 0.1%
182 2
 
< 0.1%
180 1
 
< 0.1%
178 1
 
< 0.1%
176.5 2
 
< 0.1%
175 1
 
< 0.1%
170 5
< 0.1%
167 2
 
< 0.1%

Team
Text

Distinct: 1184
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length47
Median length39
Mean length8.4146601
Min length2

Characters and Unicode

Total characters: 2281349
Distinct characters: 72
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique103 ?
Unique (%)< 0.1%

Sample

1st rowChina
2nd rowChina
3rd rowDenmark
4th rowDenmark/Sweden
5th rowNetherlands
Value                 Count    Frequency (%)
united                19046    5.6%
states                18179    5.4%
germany               15068    4.5%
france                11999    3.6%
great                 11812    3.5%
britain               11414    3.4%
italy                 10260    3.0%
canada                9279     2.7%
japan                 8289     2.5%
sweden                8052     2.4%
Other values (1317)   214526   63.5%

Most occurring characters

ValueCountFrequency (%)
a 324455
14.2%
e 199934
 
8.8%
n 195074
 
8.6%
i 170815
 
7.5%
t 154682
 
6.8%
r 140338
 
6.2%
l 84992
 
3.7%
o 80470
 
3.5%
s 76328
 
3.3%
d 74872
 
3.3%
Other values (62) 779389
34.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1863640
81.7%
Uppercase Letter 337030
 
14.8%
Space Separator 66808
 
2.9%
Decimal Number 6417
 
0.3%
Dash Punctuation 6260
 
0.3%
Other Punctuation 798
 
< 0.1%
Open Punctuation 198
 
< 0.1%
Close Punctuation 198
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 324455
17.4%
e 199934
10.7%
n 195074
10.5%
i 170815
9.2%
t 154682
8.3%
r 140338
 
7.5%
l 84992
 
4.6%
o 80470
 
4.3%
s 76328
 
4.1%
d 74872
 
4.0%
Other values (16) 361680
19.4%
Uppercase Letter
ValueCountFrequency (%)
S 56302
16.7%
G 32456
9.6%
C 29629
 
8.8%
U 29530
 
8.8%
B 27135
 
8.1%
A 20878
 
6.2%
F 18207
 
5.4%
I 17642
 
5.2%
N 15441
 
4.6%
R 13485
 
4.0%
Other values (16) 76325
22.6%
Decimal Number
ValueCountFrequency (%)
1 2910
45.3%
2 2829
44.1%
3 413
 
6.4%
4 107
 
1.7%
9 35
 
0.5%
7 34
 
0.5%
6 31
 
0.5%
5 23
 
0.4%
8 18
 
0.3%
0 17
 
0.3%
Other Punctuation
ValueCountFrequency (%)
' 227
28.4%
" 226
28.3%
, 138
17.3%
. 127
15.9%
/ 43
 
5.4%
# 37
 
4.6%
Space Separator
ValueCountFrequency (%)
66808
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6260
100.0%
Open Punctuation
ValueCountFrequency (%)
( 198
100.0%
Close Punctuation
ValueCountFrequency (%)
) 198
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2200670
96.5%
Common 80679
 
3.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 324455
14.7%
e 199934
 
9.1%
n 195074
 
8.9%
i 170815
 
7.8%
t 154682
 
7.0%
r 140338
 
6.4%
l 84992
 
3.9%
o 80470
 
3.7%
s 76328
 
3.5%
d 74872
 
3.4%
Other values (42) 698710
31.7%
Common
ValueCountFrequency (%)
66808
82.8%
- 6260
 
7.8%
1 2910
 
3.6%
2 2829
 
3.5%
3 413
 
0.5%
' 227
 
0.3%
" 226
 
0.3%
( 198
 
0.2%
) 198
 
0.2%
, 138
 
0.2%
Other values (10) 472
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2281349
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 324455
14.2%
e 199934
 
8.8%
n 195074
 
8.6%
i 170815
 
7.5%
t 154682
 
6.8%
r 140338
 
6.2%
l 84992
 
3.7%
o 80470
 
3.5%
s 76328
 
3.3%
d 74872
 
3.3%
Other values (62) 779389
34.2%

NOC
Text

Distinct: 230
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters: 813348
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowCHN
2nd rowCHN
3rd rowDEN
4th rowDEN
5th rowNED
Value                Count    Frequency (%)
usa                  18853    7.0%
fra                  12758    4.7%
gbr                  12256    4.5%
ita                  10715    4.0%
ger                  9830     3.6%
can                  9733     3.6%
jpn                  8444     3.1%
swe                  8339     3.1%
aus                  7638     2.8%
hun                  6607     2.4%
Other values (220)   165943   61.2%

Most occurring characters

ValueCountFrequency (%)
R 96188
11.8%
A 87196
 
10.7%
U 79908
 
9.8%
S 67289
 
8.3%
N 60744
 
7.5%
E 54771
 
6.7%
G 44903
 
5.5%
I 33230
 
4.1%
B 31123
 
3.8%
C 29054
 
3.6%
Other values (16) 228942
28.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 813348
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 96188
11.8%
A 87196
 
10.7%
U 79908
 
9.8%
S 67289
 
8.3%
N 60744
 
7.5%
E 54771
 
6.7%
G 44903
 
5.5%
I 33230
 
4.1%
B 31123
 
3.8%
C 29054
 
3.6%
Other values (16) 228942
28.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 813348
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 96188
11.8%
A 87196
 
10.7%
U 79908
 
9.8%
S 67289
 
8.3%
N 60744
 
7.5%
E 54771
 
6.7%
G 44903
 
5.5%
I 33230
 
4.1%
B 31123
 
3.8%
C 29054
 
3.6%
Other values (16) 228942
28.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 813348
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 96188
11.8%
A 87196
 
10.7%
U 79908
 
9.8%
S 67289
 
8.3%
N 60744
 
7.5%
E 54771
 
6.7%
G 44903
 
5.5%
I 33230
 
4.1%
B 31123
 
3.8%
C 29054
 
3.6%
Other values (16) 228942
28.1%

Games
Text

Distinct: 51
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length11
Median length11
Mean length11
Min length11

Characters and Unicode

Total characters: 2982276
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1992 Summer
2nd row2012 Summer
3rd row1920 Summer
4th row1900 Summer
5th row1988 Winter
Value               Count    Frequency (%)
summer              222552   41.0%
winter              48564    9.0%
1992                16413    3.0%
1988                14676    2.7%
2000                13821    2.5%
1996                13780    2.5%
2016                13688    2.5%
2008                13602    2.5%
2004                13443    2.5%
2012                12920    2.4%
Other values (27)   158773   29.3%
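
The sample rows suggest that each Games label is a four-digit year, a space, and a season, which would explain why every label is 11 characters long. The sketch below splits the sample labels on that assumption; `split_games` is a hypothetical helper, and the format is inferred from the samples rather than documented in the report.

```python
from typing import Tuple

# The five sample rows shown for the Games variable.
samples = [
    "1992 Summer",
    "2012 Summer",
    "1920 Summer",
    "1900 Summer",
    "1988 Winter",
]


def split_games(label: str) -> Tuple[int, str]:
    """Split a label such as '1992 Summer' into (year, season)."""
    year_text, season = label.split(" ", 1)
    return int(year_text), season


parsed = [split_games(label) for label in samples]

# Every label has the same length, so the total character count
# is simply that length times the number of rows.
label_length = len(samples[0])
total_characters = label_length * 271_116
print(parsed, total_characters)
```

The product matches the 2982276 total characters reported for Games.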

Most occurring characters

ValueCountFrequency (%)
m 445104
14.9%
271116
9.1%
e 271116
9.1%
r 271116
9.1%
1 225799
7.6%
9 222816
7.5%
S 222552
7.5%
u 222552
7.5%
0 185309
 
6.2%
2 162937
 
5.5%
Other values (10) 481859
16.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1355580
45.5%
Decimal Number 1084464
36.4%
Space Separator 271116
 
9.1%
Uppercase Letter 271116
 
9.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 225799
20.8%
9 222816
20.5%
0 185309
17.1%
2 162937
15.0%
8 94098
8.7%
6 87494
 
8.1%
4 57036
 
5.3%
7 22461
 
2.1%
5 15792
 
1.5%
3 10722
 
1.0%
Lowercase Letter
ValueCountFrequency (%)
m 445104
32.8%
e 271116
20.0%
r 271116
20.0%
u 222552
16.4%
i 48564
 
3.6%
n 48564
 
3.6%
t 48564
 
3.6%
Uppercase Letter
ValueCountFrequency (%)
S 222552
82.1%
W 48564
 
17.9%
Space Separator
ValueCountFrequency (%)
271116
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1626696
54.5%
Common 1355580
45.5%

Most frequent character per script

Common
ValueCountFrequency (%)
271116
20.0%
1 225799
16.7%
9 222816
16.4%
0 185309
13.7%
2 162937
12.0%
8 94098
 
6.9%
6 87494
 
6.5%
4 57036
 
4.2%
7 22461
 
1.7%
5 15792
 
1.2%
Latin
ValueCountFrequency (%)
m 445104
27.4%
e 271116
16.7%
r 271116
16.7%
S 222552
13.7%
u 222552
13.7%
W 48564
 
3.0%
i 48564
 
3.0%
n 48564
 
3.0%
t 48564
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2982276
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
m 445104
14.9%
271116
9.1%
e 271116
9.1%
r 271116
9.1%
1 225799
7.6%
9 222816
7.5%
S 222552
7.5%
u 222552
7.5%
0 185309
 
6.2%
2 162937
 
5.5%
Other values (10) 481859
16.2%

Year
Real number (ℝ)

HIGH CORRELATION 

Distinct: 35
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1978.3785
Minimum: 1896
Maximum: 2016
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 1896
5-th percentile: 1920
Q1: 1960
Median: 1988
Q3: 2002
95-th percentile: 2016
Maximum: 2016
Range: 120
Interquartile range (IQR): 42

Descriptive statistics

Standard deviation: 29.877632
Coefficient of variation (CV): 0.015102081
Kurtosis: -0.20694758
Mean: 1978.3785
Median Absolute Deviation (MAD): 20
Skewness: -0.81773578
Sum: 5.3637006 × 10^8
Variance: 892.67289
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=35)
Value               Count    Frequency (%)
1992                16413    6.1%
1988                14676    5.4%
2000                13821    5.1%
1996                13780    5.1%
2016                13688    5.0%
2008                13602    5.0%
2004                13443    5.0%
2012                12920    4.8%
1972                11959    4.4%
1984                11588    4.3%
Other values (25)   135226   49.9%
ValueCountFrequency (%)
1896 380
 
0.1%
1900 1936
 
0.7%
1904 1301
 
0.5%
1906 1733
 
0.6%
1908 3101
1.1%
1912 4040
1.5%
1920 4292
1.6%
1924 5693
2.1%
1928 5574
2.1%
1932 3321
1.2%
ValueCountFrequency (%)
2016 13688
5.0%
2014 4891
 
1.8%
2012 12920
4.8%
2010 4402
 
1.6%
2008 13602
5.0%
2006 4382
 
1.6%
2004 13443
5.0%
2002 4109
 
1.5%
2000 13821
5.1%
1998 3605
 
1.3%

Season
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB
Summer: 222552
Winter: 48564

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters: 1626696
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSummer
2nd rowSummer
3rd rowSummer
4th rowSummer
5th rowWinter

Common Values

Value    Count    Frequency (%)
Summer   222552   82.1%
Winter   48564    17.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
summer 222552
82.1%
winter 48564
 
17.9%

Most occurring characters

ValueCountFrequency (%)
m 445104
27.4%
e 271116
16.7%
r 271116
16.7%
S 222552
13.7%
u 222552
13.7%
W 48564
 
3.0%
i 48564
 
3.0%
n 48564
 
3.0%
t 48564
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1355580
83.3%
Uppercase Letter 271116
 
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m 445104
32.8%
e 271116
20.0%
r 271116
20.0%
u 222552
16.4%
i 48564
 
3.6%
n 48564
 
3.6%
t 48564
 
3.6%
Uppercase Letter
ValueCountFrequency (%)
S 222552
82.1%
W 48564
 
17.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 1626696
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
m 445104
27.4%
e 271116
16.7%
r 271116
16.7%
S 222552
13.7%
u 222552
13.7%
W 48564
 
3.0%
i 48564
 
3.0%
n 48564
 
3.0%
t 48564
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1626696
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
m 445104
27.4%
e 271116
16.7%
r 271116
16.7%
S 222552
13.7%
u 222552
13.7%
W 48564
 
3.0%
i 48564
 
3.0%
n 48564
 
3.0%
t 48564
 
3.0%

City
Categorical

HIGH CORRELATION 

Distinct: 42
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB
London: 22426
Athina: 15556
Sydney: 13821
Atlanta: 13780
Rio de Janeiro: 13688
Other values (37): 191845

Length

Max length22
Median length14
Mean length7.7807912
Min length4

Characters and Unicode

Total characters: 2109497
Distinct characters: 45
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowBarcelona
2nd rowLondon
3rd rowAntwerpen
4th rowParis
5th rowCalgary

Common Values

Value               Count    Frequency (%)
London              22426    8.3%
Athina              15556    5.7%
Sydney              13821    5.1%
Atlanta             13780    5.1%
Rio de Janeiro      13688    5.0%
Beijing             13602    5.0%
Barcelona           12977    4.8%
Los Angeles         12423    4.6%
Seoul               12037    4.4%
Munich              10304    3.8%
Other values (32)   130502   48.1%
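
The City summary lists the top five cities with "Other values (37)", while this table lists the top ten with "Other values (32)". The two views are consistent: folding the sixth to tenth rows into the remainder reproduces the shorter summary. The sketch below illustrates this; `collapse` is a hypothetical helper and not the profiler's own routine.

```python
# Top-ten rows and the remainder, copied from the Common Values table.
top10 = [
    ("London", 22426),
    ("Athina", 15556),
    ("Sydney", 13821),
    ("Atlanta", 13780),
    ("Rio de Janeiro", 13688),
    ("Beijing", 13602),
    ("Barcelona", 12977),
    ("Los Angeles", 12423),
    ("Seoul", 12037),
    ("Munich", 10304),
]
other_count, other_distinct = 130502, 32


def collapse(rows, other_count, other_distinct, keep):
    """Keep the first `keep` rows and fold the rest into the remainder."""
    head, tail = rows[:keep], rows[keep:]
    folded_count = other_count + sum(count for _, count in tail)
    folded_distinct = other_distinct + len(tail)
    return head, folded_count, folded_distinct


head, rest_count, rest_distinct = collapse(top10, other_count, other_distinct, keep=5)

# All rows together should account for every observation.
total_rows = sum(count for _, count in top10) + other_count
print(rest_count, rest_distinct, total_rows)
```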

Length

Histogram of lengths of the category
ValueCountFrequency (%)
london 22426
 
6.7%
athina 15556
 
4.6%
sydney 13821
 
4.1%
atlanta 13780
 
4.1%
rio 13688
 
4.1%
de 13688
 
4.1%
janeiro 13688
 
4.1%
beijing 13602
 
4.1%
barcelona 12977
 
3.9%
city 12697
 
3.8%
Other values (41) 189277
56.5%

Most occurring characters

ValueCountFrequency (%)
n 214605
 
10.2%
o 207230
 
9.8%
e 193828
 
9.2%
a 164703
 
7.8%
i 156422
 
7.4%
l 114486
 
5.4%
r 96081
 
4.6%
t 92438
 
4.4%
64084
 
3.0%
s 59391
 
2.8%
Other values (35) 746229
35.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1719503
81.5%
Uppercase Letter 322407
 
15.3%
Space Separator 64084
 
3.0%
Other Punctuation 2608
 
0.1%
Dash Punctuation 895
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 214605
12.5%
o 207230
12.1%
e 193828
11.3%
a 164703
9.6%
i 156422
9.1%
l 114486
 
6.7%
r 96081
 
5.6%
t 92438
 
5.4%
s 59391
 
3.5%
d 58332
 
3.4%
Other values (15) 361987
21.1%
Uppercase Letter
ValueCountFrequency (%)
A 55786
17.3%
S 47059
14.6%
L 45517
14.1%
M 41210
12.8%
B 33085
10.3%
R 21807
 
6.8%
C 17103
 
5.3%
J 13688
 
4.2%
T 12084
 
3.7%
P 10162
 
3.2%
Other values (6) 24906
7.7%
Other Punctuation
ValueCountFrequency (%)
' 1307
50.1%
. 1301
49.9%
Space Separator
ValueCountFrequency (%)
64084
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 895
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2041910
96.8%
Common 67587
 
3.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 214605
 
10.5%
o 207230
 
10.1%
e 193828
 
9.5%
a 164703
 
8.1%
i 156422
 
7.7%
l 114486
 
5.6%
r 96081
 
4.7%
t 92438
 
4.5%
s 59391
 
2.9%
d 58332
 
2.9%
Other values (31) 684394
33.5%
Common
ValueCountFrequency (%)
64084
94.8%
' 1307
 
1.9%
. 1301
 
1.9%
- 895
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2109497
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 214605
 
10.2%
o 207230
 
9.8%
e 193828
 
9.2%
a 164703
 
7.8%
i 156422
 
7.4%
l 114486
 
5.4%
r 96081
 
4.6%
t 92438
 
4.4%
64084
 
3.0%
s 59391
 
2.8%
Other values (35) 746229
35.4%

Sport
Text

Distinct: 66
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length25
Median length20
Mean length9.5066577
Min length4

Characters and Unicode

Total characters: 2577407
Distinct characters: 47
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowBasketball
2nd rowJudo
3rd rowFootball
4th rowTug-Of-War
5th rowSpeed Skating
Value               Count    Frequency (%)
athletics           38624    11.5%
gymnastics          27365    8.2%
swimming            24104    7.2%
skiing              18899    5.7%
shooting            11448    3.4%
hockey              10933    3.3%
cycling             10859    3.2%
fencing             10735    3.2%
rowing              10595    3.2%
skating             9445     2.8%
Other values (68)   161473   48.3%

Most occurring characters

ValueCountFrequency (%)
i 311765
 
12.1%
n 236992
 
9.2%
t 196051
 
7.6%
e 156041
 
6.1%
s 149741
 
5.8%
g 144194
 
5.6%
l 142623
 
5.5%
o 133948
 
5.2%
c 111472
 
4.3%
a 103764
 
4.0%
Other values (37) 890816
34.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2178883
84.5%
Uppercase Letter 334820
 
13.0%
Space Separator 63364
 
2.5%
Dash Punctuation 340
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 311765
14.3%
n 236992
10.9%
t 196051
 
9.0%
e 156041
 
7.2%
s 149741
 
6.9%
g 144194
 
6.6%
l 142623
 
6.5%
o 133948
 
6.1%
c 111472
 
5.1%
a 103764
 
4.8%
Other values (15) 492292
22.6%
Uppercase Letter
ValueCountFrequency (%)
S 84409
25.2%
A 53391
15.9%
C 40724
12.2%
G 27612
 
8.2%
B 21451
 
6.4%
F 20715
 
6.2%
W 15107
 
4.5%
H 14598
 
4.4%
R 11730
 
3.5%
T 9763
 
2.9%
Other values (10) 35320
10.5%
Space Separator
ValueCountFrequency (%)
63364
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 340
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2513703
97.5%
Common 63704
 
2.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 311765
 
12.4%
n 236992
 
9.4%
t 196051
 
7.8%
e 156041
 
6.2%
s 149741
 
6.0%
g 144194
 
5.7%
l 142623
 
5.7%
o 133948
 
5.3%
c 111472
 
4.4%
a 103764
 
4.1%
Other values (35) 827112
32.9%
Common
ValueCountFrequency (%)
63364
99.5%
- 340
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2577407
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 311765
 
12.1%
n 236992
 
9.2%
t 196051
 
7.6%
e 156041
 
6.1%
s 149741
 
5.8%
g 144194
 
5.6%
l 142623
 
5.5%
o 133948
 
5.2%
c 111472
 
4.3%
a 103764
 
4.0%
Other values (37) 890816
34.6%

Event
Text

Distinct: 765
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length85
Median length58
Mean length32.063335
Min length15

Characters and Unicode

Total characters: 8692883
Distinct characters: 69
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowBasketball Men's Basketball
2nd rowJudo Men's Extra-Lightweight
3rd rowFootball Men's Football
4th rowTug-Of-War Men's Tug-Of-War
5th rowSpeed Skating Women's 500 metres
Value                Count    Frequency (%)
men's                182260   15.0%
women's              71916    5.9%
metres               70024    5.7%
athletics            38624    3.2%
gymnastics           27365    2.2%
individual           25476    2.1%
swimming             24136    2.0%
hockey               21866    1.8%
team                 20722    1.7%
skiing               18899    1.6%
Other values (428)   717273   58.9%

Most occurring characters

ValueCountFrequency (%)
947445
 
10.9%
e 917589
 
10.6%
n 614723
 
7.1%
s 597800
 
6.9%
i 522595
 
6.0%
t 416689
 
4.8%
l 413207
 
4.8%
o 383702
 
4.4%
a 323320
 
3.7%
m 306751
 
3.5%
Other values (59) 3249062
37.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6076066
69.9%
Uppercase Letter 1033959
 
11.9%
Space Separator 947445
 
10.9%
Other Punctuation 334267
 
3.8%
Decimal Number 273212
 
3.1%
Dash Punctuation 26342
 
0.3%
Open Punctuation 794
 
< 0.1%
Close Punctuation 794
 
< 0.1%
Math Symbol 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 917589
15.1%
n 614723
10.1%
s 597800
9.8%
i 522595
8.6%
t 416689
 
6.9%
l 413207
 
6.8%
o 383702
 
6.3%
a 323320
 
5.3%
m 306751
 
5.0%
r 286237
 
4.7%
Other values (15) 1293453
21.3%
Uppercase Letter
ValueCountFrequency (%)
M 215454
20.8%
S 122248
11.8%
W 96251
9.3%
A 74247
 
7.2%
F 64464
 
6.2%
C 53817
 
5.2%
H 52982
 
5.1%
T 50659
 
4.9%
R 49417
 
4.8%
B 48549
 
4.7%
Other values (15) 205871
19.9%
Decimal Number
ValueCountFrequency (%)
0 155854
57.0%
1 39994
 
14.6%
5 25528
 
9.3%
4 25157
 
9.2%
2 14790
 
5.4%
3 5109
 
1.9%
8 3429
 
1.3%
7 1944
 
0.7%
6 1332
 
0.5%
9 75
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
' 254702
76.2%
, 76004
 
22.7%
. 2365
 
0.7%
/ 1196
 
0.4%
Space Separator
ValueCountFrequency (%)
947445
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 26342
100.0%
Open Punctuation
ValueCountFrequency (%)
( 794
100.0%
Close Punctuation
ValueCountFrequency (%)
) 794
100.0%
Math Symbol
ValueCountFrequency (%)
+ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7110025
81.8%
Common 1582858
 
18.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 917589 | 12.9%
n | 614723 | 8.6%
s | 597800 | 8.4%
i | 522595 | 7.4%
t | 416689 | 5.9%
l | 413207 | 5.8%
o | 383702 | 5.4%
a | 323320 | 4.5%
m | 306751 | 4.3%
r | 286237 | 4.0%
Other values (40) | 2327412 | 32.7%

Common
Value | Count | Frequency (%)
(space) | 947445 | 59.9%
' | 254702 | 16.1%
0 | 155854 | 9.8%
, | 76004 | 4.8%
1 | 39994 | 2.5%
- | 26342 | 1.7%
5 | 25528 | 1.6%
4 | 25157 | 1.6%
2 | 14790 | 0.9%
3 | 5109 | 0.3%
Other values (9) | 11933 | 0.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8692883 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 947445 | 10.9%
e | 917589 | 10.6%
n | 614723 | 7.1%
s | 597800 | 6.9%
i | 522595 | 6.0%
t | 416689 | 4.8%
l | 413207 | 4.8%
o | 383702 | 4.4%
a | 323320 | 3.7%
m | 306751 | 3.5%
Other values (59) | 3249062 | 37.4%

Medal
Categorical

MISSING 

Distinct | 3
Distinct (%) | < 0.1%
Missing | 231333
Missing (%) | 85.3%
Memory size | 2.1 MiB

Gold | 13372
Bronze | 13295
Silver | 13116

Length

Max length | 6
Median length | 6
Mean length | 5.3277531
Min length | 4

Characters and Unicode

Total characters | 211954
Distinct characters | 12
Distinct categories | 2
Distinct scripts | 1
Distinct blocks | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
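
As a cross-check of the character totals reported for Medal, the sketch below recomputes them from the three value counts using Python's standard `unicodedata` module. It is an illustration only, not the profiler's own implementation; the `medal_counts` dictionary and the other variable names are assumptions made for this example, with the counts copied from the summary above.

```python
import unicodedata
from collections import Counter

# Value counts copied from the Medal summary in this report.
medal_counts = {"Gold": 13372, "Bronze": 13295, "Silver": 13116}

category_totals = Counter()  # general category code -> number of characters
distinct_chars = set()
for value, count in medal_counts.items():
    for ch in value:
        # unicodedata.category returns codes such as "Lu" (uppercase letter)
        # and "Ll" (lowercase letter).
        category_totals[unicodedata.category(ch)] += count
        distinct_chars.add(ch)

total_chars = sum(category_totals.values())
mean_length = total_chars / sum(medal_counts.values())

print(total_chars)            # 211954
print(dict(category_totals))  # {'Lu': 39783, 'Ll': 172171}
print(len(distinct_chars))    # 12
print(mean_length)            # about 5.32775
```

The totals agree with the reported figures: 211954 characters, 12 distinct characters, and two categories (uppercase and lowercase letters).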

Unique

Unique | 0
Unique (%) | 0.0%

Sample

1st row | Gold
2nd row | Bronze
3rd row | Bronze
4th row | Bronze
5th row | Bronze

Common Values

Value | Count | Frequency (%)
Gold | 13372 | 4.9%
Bronze | 13295 | 4.9%
Silver | 13116 | 4.8%
(Missing) | 231333 | 85.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
gold | 13372 | 33.6%
bronze | 13295 | 33.4%
silver | 13116 | 33.0%
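
The two Medal frequency tables use different denominators: the Common Values table divides by all observations (missing included), while this table divides by non-missing values only. The short sketch below reproduces both sets of percentages from the reported counts. The variable names are assumptions chosen for illustration.

```python
# Counts copied from the Medal section of this report.
medal_counts = {"Gold": 13372, "Bronze": 13295, "Silver": 13116}
n_missing = 231333

n_present = sum(medal_counts.values())  # rows with a medal recorded
n_rows = n_present + n_missing          # all observations

# Share of every row in the dataset (missing values included).
share_of_all = {k: round(100 * v / n_rows, 1) for k, v in medal_counts.items()}
# Share of the rows where a medal is present.
share_of_present = {k: round(100 * v / n_present, 1) for k, v in medal_counts.items()}
missing_pct = round(100 * n_missing / n_rows, 1)

print(n_rows)            # 271116
print(share_of_all)      # {'Gold': 4.9, 'Bronze': 4.9, 'Silver': 4.8}
print(share_of_present)  # {'Gold': 33.6, 'Bronze': 33.4, 'Silver': 33.0}
print(missing_pct)       # 85.3
```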

Most occurring characters

Value | Count | Frequency (%)
o | 26667 | 12.6%
l | 26488 | 12.5%
r | 26411 | 12.5%
e | 26411 | 12.5%
G | 13372 | 6.3%
d | 13372 | 6.3%
B | 13295 | 6.3%
n | 13295 | 6.3%
z | 13295 | 6.3%
S | 13116 | 6.2%
Other values (2) | 26232 | 12.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 172171 | 81.2%
Uppercase Letter | 39783 | 18.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 26667 | 15.5%
l | 26488 | 15.4%
r | 26411 | 15.3%
e | 26411 | 15.3%
d | 13372 | 7.8%
n | 13295 | 7.7%
z | 13295 | 7.7%
i | 13116 | 7.6%
v | 13116 | 7.6%

Uppercase Letter
Value | Count | Frequency (%)
G | 13372 | 33.6%
B | 13295 | 33.4%
S | 13116 | 33.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 211954 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 26667 | 12.6%
l | 26488 | 12.5%
r | 26411 | 12.5%
e | 26411 | 12.5%
G | 13372 | 6.3%
d | 13372 | 6.3%
B | 13295 | 6.3%
n | 13295 | 6.3%
z | 13295 | 6.3%
S | 13116 | 6.2%
Other values (2) | 26232 | 12.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 211954 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 26667 | 12.6%
l | 26488 | 12.5%
r | 26411 | 12.5%
e | 26411 | 12.5%
G | 13372 | 6.3%
d | 13372 | 6.3%
B | 13295 | 6.3%
n | 13295 | 6.3%
z | 13295 | 6.3%
S | 13116 | 6.2%
Other values (2) | 26232 | 12.4%

Interactions

[Figure: 25 pairwise interaction plots]

Correlations

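Variable | ID | Age | Height | Weight | Year | Sex | Season | City | Medal
ID | 1.000 | -0.002 | -0.011 | -0.012 | 0.013 | 0.029 | 0.038 | 0.024 | 0.000
Age | -0.002 | 1.000 | 0.145 | 0.217 | 0.001 | 0.251 | 0.084 | 0.090 | 0.000
Height | -0.011 | 0.145 | 1.000 | 0.827 | 0.050 | 0.489 | 0.117 | 0.060 | 0.019
Weight | -0.012 | 0.217 | 0.827 | 1.000 | 0.009 | 0.537 | 0.070 | 0.050 | 0.017
Year | 0.013 | 0.001 | 0.050 | 0.009 | 1.000 | 0.292 | 0.162 | 0.912 | 0.015
Sex | 0.029 | 0.251 | 0.489 | 0.537 | 0.292 | 1.000 | 0.037 | 0.257 | 0.000
Season | 0.038 | 0.084 | 0.117 | 0.070 | 0.162 | 0.037 | 1.000 | 1.000 | 0.000
City | 0.024 | 0.090 | 0.060 | 0.050 | 0.912 | 0.257 | 1.000 | 1.000 | 0.000
Medal | 0.000 | 0.000 | 0.019 | 0.017 | 0.015 | 0.000 | 0.000 | 0.000 | 1.000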
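
To see which pairs in this matrix line up with the report's high-correlation alerts, the sketch below scans the upper triangle for coefficients at or above a cutoff. The 0.5 cutoff is an assumption for illustration (the report does not state the threshold it used), and `names`, `matrix`, and `THRESHOLD` are names invented for this example.

```python
names = ["ID", "Age", "Height", "Weight", "Year", "Sex", "Season", "City", "Medal"]

# Coefficients copied from the correlation matrix in this report.
matrix = [
    [1.000, -0.002, -0.011, -0.012, 0.013, 0.029, 0.038, 0.024, 0.000],
    [-0.002, 1.000, 0.145, 0.217, 0.001, 0.251, 0.084, 0.090, 0.000],
    [-0.011, 0.145, 1.000, 0.827, 0.050, 0.489, 0.117, 0.060, 0.019],
    [-0.012, 0.217, 0.827, 1.000, 0.009, 0.537, 0.070, 0.050, 0.017],
    [0.013, 0.001, 0.050, 0.009, 1.000, 0.292, 0.162, 0.912, 0.015],
    [0.029, 0.251, 0.489, 0.537, 0.292, 1.000, 0.037, 0.257, 0.000],
    [0.038, 0.084, 0.117, 0.070, 0.162, 0.037, 1.000, 1.000, 0.000],
    [0.024, 0.090, 0.060, 0.050, 0.912, 0.257, 1.000, 1.000, 0.000],
    [0.000, 0.000, 0.019, 0.017, 0.015, 0.000, 0.000, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report

# Walk the upper triangle only, so each pair is listed once
# and the diagonal (always 1.000) is skipped.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

With this cutoff the pairs found are Height and Weight, Weight and Sex, Year and City, and Season and City. Height and Sex (0.489) falls just below it.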

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
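
One way to read nullity correlation is as an ordinary Pearson correlation between 0/1 "is missing" indicators. The sketch below shows that idea on a four-row toy table; the rows are invented for illustration and are not taken from the dataset, and the helper names `pearson` and `nullity` are hypothetical. How the report's heatmap is actually computed is not stated here.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Toy rows (invented): Height and Weight are always missing together.
rows = [
    {"Height": 180.0, "Weight": 80.0, "Medal": None},
    {"Height": None, "Weight": None, "Medal": "Gold"},
    {"Height": 185.0, "Weight": 82.0, "Medal": None},
    {"Height": None, "Weight": None, "Medal": None},
]

def nullity(column):
    """1 where the value is missing, 0 where it is present."""
    return [1 if row[column] is None else 0 for row in rows]

height_weight = pearson(nullity("Height"), nullity("Weight"))
height_medal = pearson(nullity("Height"), nullity("Medal"))

print(height_weight)  # 1.0: the two columns are missing on exactly the same rows
print(height_medal)   # negative: on these toy rows Medal tends to be missing where Height is present
```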

Sample

(index) | ID | Name | Sex | Age | Height | Weight | Team | NOC | Games | Year | Season | City | Sport | Event | Medal
0 | 1 | A Dijiang | M | 24.0 | 180.0 | 80.0 | China | CHN | 1992 Summer | 1992 | Summer | Barcelona | Basketball | Basketball Men's Basketball | NaN
1 | 2 | A Lamusi | M | 23.0 | 170.0 | 60.0 | China | CHN | 2012 Summer | 2012 | Summer | London | Judo | Judo Men's Extra-Lightweight | NaN
2 | 3 | Gunnar Nielsen Aaby | M | 24.0 | NaN | NaN | Denmark | DEN | 1920 Summer | 1920 | Summer | Antwerpen | Football | Football Men's Football | NaN
3 | 4 | Edgar Lindenau Aabye | M | 34.0 | NaN | NaN | Denmark/Sweden | DEN | 1900 Summer | 1900 | Summer | Paris | Tug-Of-War | Tug-Of-War Men's Tug-Of-War | Gold
4 | 5 | Christine Jacoba Aaftink | F | 21.0 | 185.0 | 82.0 | Netherlands | NED | 1988 Winter | 1988 | Winter | Calgary | Speed Skating | Speed Skating Women's 500 metres | NaN
5 | 5 | Christine Jacoba Aaftink | F | 21.0 | 185.0 | 82.0 | Netherlands | NED | 1988 Winter | 1988 | Winter | Calgary | Speed Skating | Speed Skating Women's 1,000 metres | NaN
6 | 5 | Christine Jacoba Aaftink | F | 25.0 | 185.0 | 82.0 | Netherlands | NED | 1992 Winter | 1992 | Winter | Albertville | Speed Skating | Speed Skating Women's 500 metres | NaN
7 | 5 | Christine Jacoba Aaftink | F | 25.0 | 185.0 | 82.0 | Netherlands | NED | 1992 Winter | 1992 | Winter | Albertville | Speed Skating | Speed Skating Women's 1,000 metres | NaN
8 | 5 | Christine Jacoba Aaftink | F | 27.0 | 185.0 | 82.0 | Netherlands | NED | 1994 Winter | 1994 | Winter | Lillehammer | Speed Skating | Speed Skating Women's 500 metres | NaN
9 | 5 | Christine Jacoba Aaftink | F | 27.0 | 185.0 | 82.0 | Netherlands | NED | 1994 Winter | 1994 | Winter | Lillehammer | Speed Skating | Speed Skating Women's 1,000 metres | NaN

(index) | ID | Name | Sex | Age | Height | Weight | Team | NOC | Games | Year | Season | City | Sport | Event | Medal
271106 | 135565 | Fernando scar Zylberberg | M | 27.0 | 168.0 | 76.0 | Argentina | ARG | 2004 Summer | 2004 | Summer | Athina | Hockey | Hockey Men's Hockey | NaN
271107 | 135566 | James Francis "Jim" Zylker | M | 21.0 | 175.0 | 75.0 | United States | USA | 1972 Summer | 1972 | Summer | Munich | Football | Football Men's Football | NaN
271108 | 135567 | Aleksandr Viktorovich Zyuzin | M | 24.0 | 183.0 | 72.0 | Russia | RUS | 2000 Summer | 2000 | Summer | Sydney | Rowing | Rowing Men's Lightweight Coxless Fours | NaN
271109 | 135567 | Aleksandr Viktorovich Zyuzin | M | 28.0 | 183.0 | 72.0 | Russia | RUS | 2004 Summer | 2004 | Summer | Athina | Rowing | Rowing Men's Lightweight Coxless Fours | NaN
271110 | 135568 | Olga Igorevna Zyuzkova | F | 33.0 | 171.0 | 69.0 | Belarus | BLR | 2016 Summer | 2016 | Summer | Rio de Janeiro | Basketball | Basketball Women's Basketball | NaN
271111 | 135569 | Andrzej ya | M | 29.0 | 179.0 | 89.0 | Poland-1 | POL | 1976 Winter | 1976 | Winter | Innsbruck | Luge | Luge Mixed (Men)'s Doubles | NaN
271112 | 135570 | Piotr ya | M | 27.0 | 176.0 | 59.0 | Poland | POL | 2014 Winter | 2014 | Winter | Sochi | Ski Jumping | Ski Jumping Men's Large Hill, Individual | NaN
271113 | 135570 | Piotr ya | M | 27.0 | 176.0 | 59.0 | Poland | POL | 2014 Winter | 2014 | Winter | Sochi | Ski Jumping | Ski Jumping Men's Large Hill, Team | NaN
271114 | 135571 | Tomasz Ireneusz ya | M | 30.0 | 185.0 | 96.0 | Poland | POL | 1998 Winter | 1998 | Winter | Nagano | Bobsleigh | Bobsleigh Men's Four | NaN
271115 | 135571 | Tomasz Ireneusz ya | M | 34.0 | 185.0 | 96.0 | Poland | POL | 2002 Winter | 2002 | Winter | Salt Lake City | Bobsleigh | Bobsleigh Men's Four | NaN

Duplicate rows

Most frequently occurring

(index) | ID | Name | Sex | Age | Height | Weight | Team | NOC | Games | Year | Season | City | Sport | Event | Medal | # duplicates
379 | 77710 | Robert Tait McKenzie | M | 65.0 | NaN | NaN | Canada | CAN | 1932 Summer | 1932 | Summer | Los Angeles | Art Competitions | Art Competitions Mixed Sculpturing, Unknown Event | NaN | 43
404 | 83312 | Alfred James Munnings | M | 69.0 | NaN | NaN | Great Britain | GBR | 1948 Summer | 1948 | Summer | London | Art Competitions | Art Competitions Mixed Painting, Unknown Event | NaN | 25
49 | 12380 | Acee Blue Eagle | M | 24.0 | NaN | NaN | United States | USA | 1932 Summer | 1932 | Summer | Los Angeles | Art Competitions | Art Competitions Mixed Painting, Unknown Event | NaN | 17
363 | 74532 | Miltiades Manno | M | 53.0 | NaN | 76.0 | Hungary | HUN | 1932 Summer | 1932 | Summer | Los Angeles | Art Competitions | Art Competitions Mixed Painting, Unknown Event | NaN | 17
415 | 86677 | Stanisaw Noakowski | M | 61.0 | NaN | NaN | Poland | POL | 1928 Summer | 1928 | Summer | Amsterdam | Art Competitions | Art Competitions Mixed Painting, Drawings And Water Colors | NaN | 17
114 | 28407 | Wilhelm (William) Hunt Diederich | M | 48.0 | NaN | NaN | United States | USA | 1932 Summer | 1932 | Summer | Los Angeles | Art Competitions | Art Competitions Mixed Painting, Unknown Event | NaN | 16
607 | 134046 | ngel Zrraga Argelles | M | 41.0 | NaN | NaN | Mexico | MEX | 1928 Summer | 1928 | Summer | Amsterdam | Art Competitions | Art Competitions Mixed Painting, Paintings | NaN | 16
212 | 44875 | Alfrd (Arnold-) Hajs (Guttmann-) | M | 50.0 | NaN | NaN | Hungary | HUN | 1928 Summer | 1928 | Summer | Amsterdam | Art Competitions | Art Competitions Mixed Architecture, Architectural Designs | NaN | 14
213 | 44875 | Alfrd (Arnold-) Hajs (Guttmann-) | M | 50.0 | NaN | NaN | Hungary | HUN | 1928 Summer | 1928 | Summer | Amsterdam | Art Competitions | Art Competitions Mixed Architecture, Designs For Town Planning | NaN | 14
61 | 14083 | Marcel Bouraine | M | NaN | NaN | NaN | France | FRA | 1924 Summer | 1924 | Summer | Paris | Art Competitions | Art Competitions Mixed Sculpturing | NaN | 13
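
A table of this shape can be approximated by counting how often each complete row occurs and keeping the rows seen more than once. The sketch below does this with the standard library on a few toy rows; the rows are invented for illustration, and how the profiler itself derives the "# duplicates" column is not stated in this report.

```python
from collections import Counter

# Toy rows (invented); each tuple stands for one full record.
rows = [
    ("A", "Painting", 1932),
    ("A", "Painting", 1932),
    ("A", "Painting", 1932),
    ("B", "Sculpturing", 1924),
    ("C", "Judo", 2012),
    ("B", "Sculpturing", 1924),
]

# Rows must be hashable (tuples here) so they can be used as Counter keys.
occurrences = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = sorted(
    ((row, n) for row, n in occurrences.items() if n > 1),
    key=lambda item: item[1],
    reverse=True,
)

for row, n in duplicates:
    print(row, n)
```

On these toy rows the result lists the "A" record with a count of 3 and the "B" record with a count of 2; the "C" record occurs once and is dropped.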